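Operating robots under real-world conditions is challenging because of the wide range of failures caused by partial observability. In relatively benign settings, such failures can be overcome by retrying or by executing one of a small number of hand-engineered recovery strategies. By contrast, contact-rich sequential manipulation tasks, such as opening doors and assembling furniture, are not amenable to exhaustive hand-engineering. To address this issue, we present a general approach for robustifying manipulation strategies in a sample-efficient manner. Our approach improves robustness by first discovering the failure modes of the current strategy through exploration in simulation, and then learning additional recovery skills to handle these failures. To ensure efficient learning, we propose an online algorithm, Value Upper Confidence Limit (Value-UCL), that selects which failure modes to prioritize and which state to recover to, so that the expected performance improves maximally in every training episode. We use our approach to learn recovery skills for door opening and evaluate them both in simulation and on a real robot. Compared to open-loop execution, our experiments show that even a limited amount of recovery learning improves task success from 71% to 92.4% in simulation and from 75% to 90% on a real robot.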
translated by 谷歌翻译
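Heuristic search has traditionally relied on hand-crafted or programmatically derived heuristics. Neural networks (NNs) are newer, powerful tools that can be used to learn complex mappings from states to cost-to-go heuristics. However, their slow single-inference time is a large overhead that can substantially slow down planning in optimized heuristic search implementations. Several recent works have described ways to exploit the batch computation of NNs to reduce this overhead in planning while retaining bounds on (sub)optimality. However, all of these methods use the NN heuristic in a "blocking" manner while building up their batches, and they ignore the fast-to-compute admissible heuristics (e.g. existing classical heuristics) that are usually available. We introduce Non-Blocking Batch A* (NBBA*), a bounded suboptimal method that lazily computes the NN heuristic in batches while allowing expansions informed by a non-NN heuristic. We show how this subtle but important change can lead to substantial reductions in expansions compared to the current blocking alternative, and we see that the performance is related to the difference between the batch-computed NN heuristic and the fast non-NN heuristic.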
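Conflict-Based Search (CBS) is a popular multi-agent path finding (MAPF) solver that employs a low-level single-agent planner and a high-level constraint tree to resolve conflicts. The vast majority of modern MAPF solvers focus on improving CBS by reducing the size of this tree through various strategies, and few methods modify the low-level planner. All low-level planners in existing CBS methods use an unweighted cost-to-go heuristic, and suboptimal CBS methods additionally use a conflict heuristic to help the high-level search. Contrary to prevailing belief, we show that the cost-to-go heuristic can be used significantly more effectively by weighting it in a specific manner alongside the conflict heuristic. We introduce two variants of doing so and demonstrate that this change can lead to 2-100x speedups in certain scenarios. In addition, to the best of our knowledge, we show the first theoretical relation between prioritized planning and bounded suboptimal CBS, and demonstrate that our methods are their natural generalization.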
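We consider the problem of completing a set of $n$ tasks with a human-robot team using minimum effort. In many domains, teaching a robot to be fully autonomous can be counterproductive if there are only finitely many tasks to be done. Rather, the optimal strategy is to weigh the cost of teaching the robot against its benefit, namely how many new tasks it allows the robot to solve autonomously. We formulate this as a planning problem in which the goal is to decide which tasks the robot should perform autonomously (act), which tasks should be delegated to a human (delegate), and which tasks the robot should be taught (learn), so that all the given tasks are completed with minimum effort. This planning problem results in a search tree that grows exponentially with $n$, which makes standard graph search algorithms intractable. We address this by converting the problem into a mixed integer program that can be solved efficiently using off-the-shelf solvers with bounds on solution quality. To predict the benefit of learning, we propose an advanced prediction classifier. Given two tasks, this classifier predicts whether a skill trained on one will transfer to the other. Finally, we evaluate our approach on peg insertion and Lego stacking tasks, both in simulation and in the real world, showing substantial savings in human effort.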
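Iterative learning control (ILC) is a powerful technique for high-performance tracking in the presence of modeling errors in optimal control applications. There is extensive prior work showing its empirical effectiveness in applications such as chemical reactors, industrial robots, and quadcopters. However, there is little prior theoretical work that explains the effectiveness of ILC even in the presence of large modeling errors, where optimal control methods using the misspecified model (MM) often perform poorly. Our work presents a theoretical study of the performance of both ILC and MM on Linear Quadratic Regulator (LQR) problems with unknown transition dynamics. We show that the suboptimality gap, measured with respect to the optimal LQR controller, is lower for ILC than for MM by higher-order terms that become significant in the regime of high modeling errors. A key part of our analysis is the perturbation bounds for the discrete Riccati equation in the finite-horizon setting, where the solution is not a fixed point and the error must be tracked using recursive bounds. We back our theoretical findings with empirical experiments on a toy linear dynamical system with an approximate model, a nonlinear inverted pendulum system with misspecified mass, and a nonlinear planar quadrotor system in the presence of wind. The experiments show that ILC significantly outperforms MM, in terms of the cost of the computed trajectories, when modeling errors are high.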
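To make the finite-horizon setting concrete, the following is a minimal sketch of the backward discrete Riccati recursion for a scalar LQR problem, together with a rollout that applies gains planned on a misspecified model (MM) to the true system. It is an illustration under stated assumptions, not the paper's analysis or its ILC algorithm: the scalar dynamics `x_{t+1} = a*x_t + b*u_t`, the function names, and all numeric values are hypothetical.

```python
def finite_horizon_lqr_scalar(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for x_{t+1} = a*x_t + b*u_t (scalar case).

    Returns feedback gains k_0..k_{T-1} (u_t = -k_t * x_t) and cost-to-go
    coefficients P_0..P_T. P_t changes with t, so it is not a fixed point.
    """
    p = q_final
    gains, values = [], [p]
    for _ in range(horizon):
        k = (a * b * p) / (r + b * b * p)   # gain computed from P_{t+1}
        p = q + a * a * p - a * b * p * k   # Riccati update giving P_t
        gains.append(k)
        values.append(p)
    gains.reverse()
    values.reverse()
    return gains, values


def rollout_cost(a, b, q, r, q_final, gains, x0):
    """Quadratic cost of applying the given gains on the true system (a, b)."""
    x, cost = x0, 0.0
    for k in gains:
        u = -k * x
        cost += q * x * x + r * u * u
        x = a * x + b * u
    return cost + q_final * x * x


# Hypothetical numbers: true dynamics a = 1.0, misspecified model a_hat = 0.5.
true_gains, true_values = finite_horizon_lqr_scalar(1.0, 1.0, 1.0, 1.0, 1.0, 20)
mm_gains, _ = finite_horizon_lqr_scalar(0.5, 1.0, 1.0, 1.0, 1.0, 20)
optimal_cost = rollout_cost(1.0, 1.0, 1.0, 1.0, 1.0, true_gains, 1.0)
mm_cost = rollout_cost(1.0, 1.0, 1.0, 1.0, 1.0, mm_gains, 1.0)
suboptimality_gap = mm_cost - optimal_cost  # non-negative by optimality
```

The gap computed here is only the MM side of the comparison; an ILC variant would additionally correct the plan using trajectories observed on the true system.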
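Incremental graph search algorithms such as D* Lite reuse previous, and perhaps partial, searches to expedite subsequent path planning tasks. In this paper, we are interested in developing incremental graph search algorithms for path finding problems that simultaneously optimize multiple objectives, such as travel risk and arrival time. This is challenging because, in a multi-objective setting, the number of "Pareto-optimal" solutions can grow exponentially with the size of the graph. This paper presents a new multi-objective incremental search algorithm called Multi-Objective Path-Based D* Lite (MOPBD*), which leverages a path-based expansion strategy to prune dominated solutions. In addition, we introduce two variants of MOPBD* that further improve search efficiency and approximate the Pareto-optimal front. We numerically evaluate the performance of MOPBD* and its variants on various maps with two and three objectives. The results show that our approach is more efficient than an approach that searches from scratch, and runs up to an order of magnitude faster than the existing incremental method for multi-objective path planning.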
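The notion of pruning dominated solutions can be illustrated with a small Pareto filter over cost vectors. This is only a sketch of the dominance check that multi-objective search relies on, not MOPBD* itself; the function names and the example cost vectors (read as risk and arrival time) are assumptions made for illustration.

```python
def dominates(u, v):
    """True if cost vector u is no worse than v in every objective
    and strictly better in at least one (all objectives are minimized)."""
    return all(a <= b for a, b in zip(u, v)) and any(a < b for a, b in zip(u, v))


def pareto_front(costs):
    """Return the non-dominated cost vectors, dropping exact duplicates."""
    front = []
    for c in costs:
        # Skip c if an existing front member dominates or equals it.
        if any(dominates(f, c) or f == c for f in front):
            continue
        # Otherwise c enters the front and evicts anything it dominates.
        front = [f for f in front if not dominates(c, f)]
        front.append(c)
    return front


# Hypothetical (risk, arrival_time) costs of candidate paths.
candidates = [(3, 4), (2, 5), (4, 4), (2, 5), (1, 9), (3, 3)]
front = sorted(pareto_front(candidates))
```

Because every pair of surviving vectors trades one objective against another, the front can keep growing with the number of paths, which is the source of the exponential blow-up mentioned above.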
Traditional planning and control methods can fail to find a feasible trajectory for an autonomous vehicle to execute in dense road traffic. This is because the obstacle-free volume in spacetime that the vehicle could drive through is very small in these scenarios. However, that does not mean the task is infeasible, since human drivers are known to be able to drive in dense traffic by leveraging the cooperativeness of other drivers to open a gap. The traditional methods fail to take into account the fact that the actions taken by an agent affect the behaviour of other vehicles on the road. In this work, we rely on the ability of deep reinforcement learning to implicitly model such interactions and learn a continuous control policy over the action space of an autonomous vehicle. The application we consider requires our agent to negotiate and open a gap in the road in order to successfully merge or change lanes. Our policy learns to repeatedly probe into the target road lane while trying to find a safe spot to move into. We compare against two model-predictive control-based algorithms and show that our policy outperforms them in simulation.
We investigate a model for image/video quality assessment based on building a set of codevectors that represent, in a sense, some basic properties of images, similar to the well-known CORNIA model. We analyze the codebook building method and propose some modifications to it. The algorithm is also investigated from the point of view of inference time reduction. Both natural and synthetic images are used for building codebooks, and some analysis of the synthetic images used for codebooks is provided. It is demonstrated that the results on quality assessment may be improved with the use of synthetic images for codebook construction. We also demonstrate regimes of the algorithm in which real-time execution on a CPU is possible with sufficiently high correlation with the mean opinion score (MOS). Various pooling strategies are considered, as well as the problem of metric sensitivity to bitrate.
The body of research on classification of solar panel arrays from aerial imagery is increasing, yet there are still not many public benchmark datasets. This paper introduces two novel benchmark datasets for classifying and localizing solar panel arrays in Denmark: a human-annotated dataset for classification and segmentation, and a classification dataset acquired using self-reported data from the Danish national building registry. We explore the performance of prior works on the new benchmark datasets, and present results after fine-tuning models using an approach similar to that of recent works. Furthermore, we train models with newer architectures and provide benchmark baselines for our datasets in several scenarios. We believe the release of these datasets may improve future research in both local and global geospatial domains on identifying and mapping solar panel arrays from aerial imagery. The data is accessible at https://osf.io/aj539/.
There has been a great deal of recent interest in learning and approximation of functions that can be expressed as expectations of a given nonlinearity with respect to its random internal parameters. Examples of such representations include "infinitely wide" neural nets, where the underlying nonlinearity is given by the activation function of an individual neuron. In this paper, we bring this perspective to function representation by neural stochastic differential equations (SDEs). A neural SDE is an Itô diffusion process whose drift and diffusion matrix are elements of some parametric families. We show that the ability of a neural SDE to realize nonlinear functions of its initial condition can be related to the problem of optimally steering a certain deterministic dynamical system between two given points in finite time. This auxiliary system is obtained by formally replacing the Brownian motion in the SDE by a deterministic control input. We derive upper and lower bounds on the minimum control effort needed to accomplish this steering; these bounds may be of independent interest in the context of motion planning and deterministic optimal control.
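The formal replacement described above can be written out as follows. This is a sketch in assumed notation, not the paper's own: the drift $f$, the diffusion coefficient $g$, the parameters $\theta$, the horizon $T$, the endpoints $x_0, x_1$, and the quadratic form of the control effort are all introduced here for illustration.

```latex
\begin{align}
  % Neural SDE: Ito diffusion with parametric drift f and diffusion coefficient g
  \mathrm{d}X_t &= f(X_t;\theta)\,\mathrm{d}t + g(X_t;\theta)\,\mathrm{d}W_t,
    & X_0 &= x_0, \\
  % Auxiliary deterministic system: dW_t formally replaced by u(t) dt
  \dot{x}(t) &= f\bigl(x(t);\theta\bigr) + g\bigl(x(t);\theta\bigr)\,u(t),
    & x(0) &= x_0,\quad x(T) = x_1, \\
  % One common measure of control effort (assumed here)
  J(u) &= \frac{1}{2}\int_0^T \lVert u(t)\rVert^2 \,\mathrm{d}t .
\end{align}
```

Under these assumptions, the steering problem is to minimize $J(u)$ over controls $u$ that drive the second system from $x_0$ to $x_1$ in time $T$.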